Application of Clustering in the Non-Parametric Estimation of Distribution Density
نویسندگان
چکیده
Abstract. This paper discusses a multimodal density function estimation problem of a random vector. A comparative accuracy analysis of some popular non-parametric estimators is made by using the Monte-Carlo method. The paper demonstrates that the estimation quality increases significantly if the sample is clustered (i.e., the multimodal density function is approximated by a mixture of unimodal densities), and later on, the density estimation methods are applied separately to each cluster. In this paper, the sample is clustered using the Gaussian distribution mixture model and the EM algorithm. The highest efficiency in the analysed cases was reached by using the iterative procedure proposed by Friedman for estimating a density component corresponding to each cluster after the primary sample clustering mentioned. The Friedman procedure is based on both the projection pursuit of multivariate observations and transformation of the univariate projections into the standard Gaussian random values (using the density function estimates of these projections).
منابع مشابه
Hyperbolic Cosine Log-Logistic Distribution and Estimation of Its Parameters by Using Maximum Likelihood Bayesian and Bootstrap Methods
In this paper, a new probability distribution, based on the family of hyperbolic cosine distributions is proposed and its various statistical and reliability characteristics are investigated. The new category of HCF distributions is obtained by combining a baseline F distribution with the hyperbolic cosine function. Based on the base log-logistics distribution, we introduce a new di...
متن کاملSemi-parametric Quantile Regression for Analysing Continuous Longitudinal Responses
Recently, quantile regression (QR) models are often applied for longitudinal data analysis. When the distribution of responses seems to be skew and asymmetric due to outliers and heavy-tails, QR models may work suitably. In this paper, a semi-parametric quantile regression model is developed for analysing continuous longitudinal responses. The error term's distribution is assumed to be Asymmetr...
متن کاملتخمین احتمال بزرگی زمینلغزشهای رخداده در حوزه آبخیز پیوهژن (استان خراسان رضوی)
Knowing the number, area, and frequency of landslides occurred in each area has a prominent role in the long-term evolution of area dominated by landslides and can be used for analyzing of susceptibility, hazard, and risk. In this regard, the current research is trying to consider identified landslides size probability in the Pivejan Watershed, Razavi Khorasan Province. In the first step, lands...
متن کاملA robust wavelet based profile monitoring and change point detection using S-estimator and clustering
Some quality characteristics are well defined when treated as response variables and are related to some independent variables. This relationship is called a profile. Parametric models, such as linear models, may be used to model profiles. However, in practical applications due to the complexity of many processes it is not usually possible to model a process using parametric models.In these cas...
متن کاملMoment Inequalities for Supremum of Empirical Processes of U-Statistic Structure and Application to Density Estimation
We derive moment inequalities for the supremum of empirical processes of U-Statistic structure and give application to kernel type density estimation and estimation of the distribution function for functions of observations.
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006